CGTASE VARIANTS

Received (PVS), 2003

FIELD OF THE INVENTION 

The present invention relates to the construction of variants of cyclodextrin
glucanotransferases (CGTases), in particular variants having the ability to form linear
oligosaccharides.

BACKGROUND OF THE INVENTION 

PDB files 1CDG, 1PAM, 1CYG and 1CIU (available at www.rcsb.org) show the amino
acid sequences and three-dimensional structures of several cyclodextrin glucanotransferases
(CGTases). WO 9943794 shows the amino acid sequence and three-dimensional structure of a
maltogenic alpha-amylase from Bacillus stearothermophilus, known as Novamyl®.

Variants of a cyclodextrin glucanotransferase (CGTase) with the ability to form linear 
oligosaccharides are disclosed in WO 9943793 and in R.J. Leemhuis: "What makes cyclodextrin 
glycosyltransferase a transglycosylase", University Library Groningen, 2003. 

L. Beier et al., Protein Engineering, vol. 13, no. 7, pp. 509-513, 2000 is titled
"Conversion of the maltogenic α-amylase Novamyl into a CGTase".

SUMMARY OF THE INVENTION 

The inventors have developed a method of modifying the amino acid sequence of a
CGTase to obtain variants. The variants may form linear oligosaccharides as an initial product
of starch hydrolysis, with a reduced amount of cyclodextrin, and may be useful for anti-staling in
baked products.

Accordingly, the invention provides a method of constructing CGTase variants based
on a comparison of three-dimensional (3D) structures of the CGTase and a maltogenic
alpha-amylase. One or both models include a substrate. The invention also provides novel CGTase
variants.

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1 shows the results of a comparison of the 3D structures 1A47 for a CGTase (SEQ
ID NO: 2) and 1QHO for the maltogenic alpha-amylase Novamyl (SEQ ID NO: 1). Details are
described in Example 1.

DETAILED DESCRIPTION OF THE INVENTION

CGTase

The method of the invention uses an amino acid sequence of a CGTase and a three- 
dimensional model for the CGTase. The model may include a substrate. 

The CGTase may have a three-dimensional structure found under the indicated identifier
in the Protein Data Bank (www.rcsb.org): B. circulans (1CDG), alkalophilic Bacillus (1PAM),
B. stearothermophilus (1CYG) or Thermoanaerobacterium thermosulfurigenes (1CIU, 1A47). 3D
structures for other CGTases may be constructed as described in Example 1 of WO 9623874.

The CGTase may particularly have a sequence as found under the following accession 
numbers in the GeneSeqP database for CGTase from the indicated microorganism: 
1. aab71493.gcg B. agaradhaerens
2. aau76326.gcg Bacillus agaradhaerens
3. cdg2_paema.gcg Paenibacillus macerans (Bacillus macerans)
4. cdg1_paema.gcg Paenibacillus macerans (Bacillus macerans)
5. cdgt_thetu.gcg Thermoanaerobacter thermosulfurogenes (Clostridium thermosulfurogenes)
6. aaw06772.gcg Thermoanaerobacter thermosulphurigenes sp. ATCC 53627
7. cdgt_bacci.gcg Bacillus circulans
8. cdgt_bacli.gcg Bacillus sp. (strain 38-2)
9. cdgt_bacs3.gcg Bacillus sp. (strain 38-2)
10. cdgt_bacs0.gcg Bacillus sp. (strain 1011)
11. cdgu_bacci.gcg Bacillus circulans
12. cdgt_bacsp.gcg Bacillus sp. (strain 17-1)
13. cdgt_bacst.gcg Bacillus stearothermophilus
14. cdgt_bacoh.gcg Bacillus ohbensis
15. cdgt_bacs2.gcg Bacillus sp. (strain 1-1)
16. cdgt_klepn.gcg Klebsiella pneumoniae

To develop variants of a CGTase without a known 3D structure, the sequence may be
aligned with a CGTase having a known 3D structure. The sequence alignment may be done by
conventional methods, e.g. by using the software GAP from UWGCG Version 8.
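Once two CGTase sequences are aligned, corresponding positions can be read off the alignment. The following is a minimal Python sketch of that step, assuming the alignment is already available as two gapped strings of equal length. The function name `map_positions` and the toy sequences are hypothetical and are not GAP output.

```python
def map_positions(aligned_query: str, aligned_ref: str, gap: str = "-") -> dict[int, int]:
    """Map 1-based residue numbers of the query to those of the reference.

    Only alignment columns where both sequences carry a residue are mapped.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    mapping: dict[int, int] = {}
    q_pos = r_pos = 0
    for q, r in zip(aligned_query, aligned_ref):
        if q != gap:
            q_pos += 1
        if r != gap:
            r_pos += 1
        if q != gap and r != gap:
            mapping[q_pos] = r_pos
    return mapping


# Hypothetical toy alignment (not taken from the sequence listing).
query = "ASD-TAV"
ref = "A-DPTSV"
mapping = map_positions(query, ref)
```

Query residues that face a gap in the reference have no entry in the mapping, which corresponds to an unmatched position.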

Maltogenic alpha-amylase

The method also uses an amino acid sequence of a maltogenic alpha-amylase (EC
3.2.1.133) and a three-dimensional model of the maltogenic alpha-amylase. The model may
include a substrate. The maltogenic alpha-amylase may have the amino acid sequence
shown in SEQ ID NO: 1 (in the following referred to as Novamyl). A
3D model for Novamyl with a substrate is described in US 6162628 and is found in the Protein
Data Bank with the identifier 1QHO. Alternatively, the maltogenic alpha-amylase may be a
Novamyl variant described in US 6162628. A 3D structure of such a variant may be developed
from the Novamyl structure by known methods, e.g. as described in T.L. Blundell et al., Nature,
vol. 326, p. 347 ff (26 March 1987); J. Greer, Proteins: Structure, Function and Genetics,
7:317-334 (1990); or Example 1 of WO 9623874.

Superimposition of 3D models

The two 3D models may be superimposed by aligning the amino acid residues of each
catalytic triad. This may be done by methods known in the art based on the deviations of the
three pairs of C-alpha atoms, e.g. by minimizing the sum of squares of the three deviations or
by aligning so as to keep each deviation below 0.8 Å, e.g. below 0.6 Å, below 0.4 Å, below
0.3 Å or below 0.2 Å.
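The criterion for the three C-alpha pairs can be illustrated with a short Python sketch. It only evaluates the deviations for coordinates that are assumed to be superimposed already; finding the rigid transformation that minimizes the sum of squares is not shown. The function names and the coordinates are hypothetical.

```python
import math

Point = tuple[float, float, float]


def deviations(ca_a: list[Point], ca_b: list[Point]) -> list[float]:
    """Distance (in Å) between each pair of corresponding C-alpha atoms."""
    if len(ca_a) != len(ca_b):
        raise ValueError("both models must supply the same number of atoms")
    return [math.dist(p, q) for p, q in zip(ca_a, ca_b)]


def sum_of_squares(devs: list[float]) -> float:
    """Quantity to be minimized when superimposing the two models."""
    return sum(d * d for d in devs)


def all_below(devs: list[float], cutoff: float) -> bool:
    """True if every deviation is below the cutoff (e.g. 0.8 Å)."""
    return all(d < cutoff for d in devs)


# Hypothetical C-alpha coordinates of the two catalytic triads after superimposition.
cgtase_triad = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 6.0, 0.0)]
amylase_triad = [(0.3, 0.0, 0.0), (5.0, 0.4, 0.0), (0.0, 6.0, 0.0)]

devs = deviations(cgtase_triad, amylase_triad)
total = sum_of_squares(devs)
```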

Alternatively, the superimposition may be based on the deviations of all corresponding
pairs of amino acid residues as shown in the alignment in Figs. 4-5 of WO 9943793 and
bringing the sum of squares of all deviations to a minimum.



Selection of amino acid sequences 

In the superimposed 3D models, amino acid residues in the CGTase sequence are selected
by two criteria. Firstly, CGTase residues < 10 Å from a substrate (having a C-alpha atom
located < 10 Å from an atom of a substrate) are selected. Secondly, CGTase residues > 0.8 Å
from any maltogenic alpha-amylase residue (having a C-alpha atom > 0.8 Å from the C-alpha
atom of any maltogenic alpha-amylase residue) are selected.
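The two selection criteria can be sketched in Python as follows, assuming the C-alpha and substrate coordinates have already been extracted from the superimposed models. The function `select_residues`, its parameter names and the toy coordinates are hypothetical illustrations.

```python
import math

Point = tuple[float, float, float]


def select_residues(
    cgtase_ca: dict[int, Point],
    amylase_ca: list[Point],
    substrate_atoms: list[Point],
    near: float = 10.0,
    unmatched: float = 0.8,
) -> list[int]:
    """Return CGTase residue numbers that satisfy both criteria.

    A residue is kept if its C-alpha atom is < `near` Å from any substrate atom
    and > `unmatched` Å from the C-alpha atom of every maltogenic alpha-amylase residue.
    """
    selected = []
    for resno, ca in sorted(cgtase_ca.items()):
        near_substrate = any(math.dist(ca, atom) < near for atom in substrate_atoms)
        no_match = all(math.dist(ca, other) > unmatched for other in amylase_ca)
        if near_substrate and no_match:
            selected.append(resno)
    return selected


# Hypothetical toy coordinates in Å.
substrate = [(0.0, 0.0, 0.0)]
cgtase = {1: (3.0, 0.0, 0.0), 2: (6.0, 0.0, 0.0), 3: (20.0, 0.0, 0.0)}
amylase = [(3.2, 0.0, 0.0), (8.0, 0.0, 0.0)]

selected = select_residues(cgtase, amylase, substrate)
```

In this toy case residue 1 is matched by an alpha-amylase residue 0.2 Å away and residue 3 is too far from the substrate, so only residue 2 is selected.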

Modifications of CGTase amino acid sequence

One or more of the following modifications are made to the CGTase sequence:

Deletion or substitution

A CGTase residue < 10 Å from a substrate and > 0.8 Å from any residue in the
maltogenic alpha-amylase sequence may be deleted or may be substituted with a different residue.
The substitution may be made with the same amino acid residue as found at a
corresponding position in the maltogenic alpha-amylase sequence or with a residue of the same
type. The type indicates a positively charged, negatively charged, hydrophilic or hydrophobic
residue, understood as follows (Tyr may be hydrophilic or hydrophobic):

Hydrophobic amino acids: Ala, Val, Leu, Ile, Pro, Phe, Trp, Gly, Met, Tyr
Hydrophilic amino acids: Thr, Ser, Gln, Asn, Tyr, Cys
Positively charged amino acids: Lys, Arg, His
Negatively charged amino acids: Glu, Asp
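The classification above can be written as a lookup table. The following Python sketch tests whether two residues share a type; the names `RESIDUE_TYPES`, `types_of` and `same_type` are hypothetical. Tyr appears in two groups, as stated above.

```python
# Residue types as listed in the text (three-letter codes).
RESIDUE_TYPES = {
    "hydrophobic": {"Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Trp", "Gly", "Met", "Tyr"},
    "hydrophilic": {"Thr", "Ser", "Gln", "Asn", "Tyr", "Cys"},
    "positive": {"Lys", "Arg", "His"},
    "negative": {"Glu", "Asp"},
}


def types_of(residue: str) -> set[str]:
    """Return every type a residue belongs to (Tyr belongs to two)."""
    return {name for name, members in RESIDUE_TYPES.items() if residue in members}


def same_type(a: str, b: str) -> bool:
    """True if the two residues share at least one type."""
    return bool(types_of(a) & types_of(b))


tyr_types = types_of("Tyr")
```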
The CGTase residue may be substituted with a larger or smaller residue depending on
whether a larger or smaller residue is found at a corresponding position in the maltogenic
alpha-amylase sequence. In this connection, the residues are ranked as follows from smallest to
largest (an equal sign indicates residues with sizes that are practically indistinguishable):
G<A=S=C<V=T<P<L=I=N=D=M<E=Q<K<H<R<F<Y<W
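The size ranking can be turned into a comparison function. The Python sketch below parses the ranking string given above (one-letter codes) into integer ranks; the helper names are hypothetical.

```python
SIZE_RANKING = "G<A=S=C<V=T<P<L=I=N=D=M<E=Q<K<H<R<F<Y<W"


def build_size_ranks(ranking: str) -> dict[str, int]:
    """Assign the same rank to residues joined by '=' and increasing ranks across '<'."""
    return {
        residue: rank
        for rank, group in enumerate(ranking.split("<"))
        for residue in group.split("=")
    }


SIZE_RANK = build_size_ranks(SIZE_RANKING)


def compare_size(a: str, b: str) -> int:
    """Return -1 if a is smaller than b, 0 if practically equal, 1 if larger."""
    ra, rb = SIZE_RANK[a], SIZE_RANK[b]
    return (ra > rb) - (ra < rb)


# Example: F ranks below Y, so a change from Y to F introduces a smaller residue.
y_to_f = compare_size("F", "Y")
```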

Also, a stretch (a "loop") of consecutive CGTase residues may be selected if each of
the residues is > 0.8 Å from any residue in the maltogenic alpha-amylase sequence and at least
one of the CGTase residues is < 10 Å from a substrate. Such a stretch of CGTase residues may be
deleted or substituted with different amino acid residues. The substitution may be made with the
residues found at the corresponding location in the maltogenic alpha-amylase sequence, with
residues of the same type, or with an equal number of residues or one or two more or fewer
residues than found in the maltogenic alpha-amylase sequence.

Insertion 

One or more amino acid residues may be inserted at a position in the CGTase
sequence corresponding to one or more residues in the maltogenic alpha-amylase sequence
which are < 10 Å from a substrate and which are > 0.8 Å from any CGTase residue. The insertion
may be made with the same residue or with an amino acid residue of the same type as the
amino acid residue in the maltogenic alpha-amylase sequence. The type indicates a positively
charged, negatively charged, hydrophilic or hydrophobic residue, as above.
Where the maltogenic alpha-amylase sequence contains a stretch (a peptide loop) of
residues < 10 Å from a substrate and > 0.8 Å from any CGTase residue, the insertion at the
corresponding position in the CGTase sequence may consist of an equal number of residues, or
the insertion may have one or two fewer or more residues. Thus, in the case of a stretch of 5
such residues in the maltogenic alpha-amylase sequence, the insertion may be made with 1-7
residues, e.g. 1, 2, 3, 4, 5, 6 or 7 residues. Each inserted residue may be the same as one of
the maltogenic alpha-amylase residues or of the same type.

Optional further modifications of the CGTase sequence 

Optionally, the CGTase sequence may be further modified by substituting one or more
residues which are matched with a residue in the maltogenic alpha-amylase sequence.
The substitution may be made with an amino acid residue of the same type (in particular
with the same residue) as the matching residue in the maltogenic alpha-amylase sequence.

Depending on whether the matching residue in the maltogenic alpha-amylase
sequence is smaller or larger than the residue in the CGTase sequence, the substitution may be
made with a smaller or larger residue (using the ranking shown above).

Production of CGTase variants

A polypeptide having the resulting amino acid sequence may be produced by conventional
methods, generally involving producing DNA with a sequence encoding the polypeptide
together with control sequences, transforming a suitable host organism with the DNA,
cultivating the transformed organism at suitable conditions for expressing and optionally secreting the
polypeptide, and optionally recovering the expressed polypeptide.

DNA encoding any of the above CGTase variants may be prepared, e.g. by point-specific
mutation of DNA encoding the parent CGTase. This may be followed by transformation
of a suitable host organism with the DNA, and cultivation of the transformed host organism
under suitable conditions to express the encoded polypeptide (CGTase variant). This may be done
by known methods.

Optional screening of CGTase variants 

Optionally, one or more expressed polypeptides may be tested for one or more useful
enzymatic activities. This may include testing for the ability to hydrolyze starch or a starch
derivative by a conventional method, e.g. a plate assay, use of Phadebas tablets or DSC on
amylopectin. Further, the initial product from starch hydrolysis may be analyzed and a polypeptide
producing an increased ratio of linear oligosaccharides to cyclodextrins may be selected. Also,
the polypeptide may be tested by adding it to a dough, baking it and testing the firmness of the
baked product during storage; a polypeptide with anti-staling effect may be selected as
described in WO 9104669 or US 6162628. Finally, the polypeptide may be tested for
thermostability, and a more thermostable one may be preferred.

Optional gene recombination 

Optionally, DNA encoding a plurality of the above CGTase variants may be prepared
and recombined, followed by transformation of a suitable host organism with the recombined
DNA, and cultivation of the transformed host organism under suitable conditions to express the
encoded polypeptides (CGTase variants). The gene recombination may be done by known
methods.

CGTase variants 

Particularly, the CGTase may be modified by substitution, insertion or deletion of an
amino acid at a position corresponding to amino acid 85-95, 152, 184, 260-269, 285, 288 or 314
of the amino acid sequence shown in SEQ ID NO: 2 or 3. The modification may comprise
substitution or insertion of an amino acid residue with an amino acid residue of a corresponding
position in the amino acid sequence of Novamyl (SEQ ID NO: 1) or a deletion of an amino acid
residue in the region which is not present at the corresponding position in the Novamyl
sequence.

More particularly, the modification may comprise substitution of amino acids
corresponding to amino acids 85-95, 260-268 or 260-269 of SEQ ID NO: 2 or 3 with TLAGTDN,
YGDDPGTANHL or YGDDPGTANHLE, respectively.

Particular examples, taking the Thermoanaerobacter CGTase (SEQ ID NO: 3) as
an example, are Y152F, F184W, R285D, Q288T and D314E. Corresponding substitutions may be
made in other CGTases.
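The substitution notation used here (parent residue, position, new residue) can be applied to a sequence programmatically. The Python sketch below is an illustration only: `apply_substitution` is a hypothetical helper, and the fragment is residues 145-160 of SEQ ID NO: 3 as read from the sequence listing, converted to one-letter codes.

```python
import re


def apply_substitution(seq: str, mutation: str, first_residue: int = 1) -> str:
    """Apply a point substitution such as 'Y152F' to a one-letter sequence.

    `first_residue` is the residue number of seq[0], so that fragments can be used.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_residue
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index] != old:
        raise ValueError(f"expected {old} at {position}, found {seq[index]}")
    return seq[:index] + new + seq[index + 1:]


# Residues 145-160 of SEQ ID NO: 3 (as read from the sequence listing).
fragment = "ASETDPTYGENGRLYD"
variant = apply_substitution(fragment, "Y152F", first_residue=145)
```

The check against the parent residue guards against applying a substitution to the wrong position, for example when numbering from a different CGTase is used without an alignment.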

Also, one or more additional modifications may be made, each being an amino acid
substitution, insertion or deletion. In particular, such modification may be made in the regions
corresponding to amino acids 40-43, 78-85, 136-139, 173-180, 189-195 or 258-268 of SEQ ID
NO: 1. In particular, the modification may be an insertion of or a substitution with an amino acid
present at the corresponding position of Novamyl, or a deletion of an amino acid not present at
the corresponding position of Novamyl. Thus, taking the Thermoanaerobacter CGTase (SEQ ID
NO: 3) as an example, one or more of the following changes may be made to introduce a loop
modeled on Novamyl:

• A85-S95 of SEQ ID NO: 3 is replaced by T80-N86 of SEQ ID NO: 1,

• N194-L198 of SEQ ID NO: 3 is replaced by N187-L196 of SEQ ID NO: 1,

• Y260-P268 of SEQ ID NO: 3 is replaced by Y258-L268 of SEQ ID NO: 1, or

• Y260-N269 of SEQ ID NO: 3 is replaced by Y258-E269 of SEQ ID NO: 1.
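A loop replacement of this kind amounts to swapping a numbered stretch of the parent sequence for a fragment of a different length. The following Python sketch illustrates the first change in the list, using residues 81-100 of SEQ ID NO: 3 and the Novamyl fragment T80-N86 (TLAGTDN) as read from the sequence listing. The helper `replace_loop` is hypothetical.

```python
def replace_loop(seq: str, start: int, end: int, replacement: str, first_residue: int = 1) -> str:
    """Replace residues start..end (inclusive, parent numbering) with `replacement`."""
    i = start - first_residue
    j = end - first_residue + 1
    if not (0 <= i < j <= len(seq)):
        raise ValueError("loop boundaries are outside the sequence")
    return seq[:i] + replacement + seq[j:]


# Residues 81-100 of SEQ ID NO: 3 (as read from the sequence listing).
parent_fragment = "ENIYAVLPDSTFGGSTSYHG"
# A85-S95 (AVLPDSTFGGS) is replaced by Novamyl T80-N86 (TLAGTDN).
variant_fragment = replace_loop(parent_fragment, 85, 95, "TLAGTDN", first_residue=81)
```

Because the inserted fragment is four residues shorter than the replaced loop, positions after the loop shift in the variant; the residue numbers used in this document refer to the parent sequence.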

The following are particular examples of variants based on the Thermoanaerobacter 
CGTase (SEQ ID NO: 3): 

Variant 1: Loop A85-S95 to Novamyl loop T80-N86, Loop N194-L198 to Novamyl loop
N187-L196, and Y152F.

Variant 2: As Variant 1 with addition of F184W, R285D, Q288T, D314E, and Loop
Y260-P268 to Novamyl loop Y258-L268.

Variant 3: Loop A85-S95 to Novamyl loop T80-N86, Loop Y260-P268 to Novamyl loop
Y258-L268, Y152F, G257D, R285D, Q288T, D314E.

EXAMPLES 

Example 1: Construction of CGTase residues based on 3D structures

Two 3D structures with substrates were used: 1A47 for a CGTase (SEQ ID NO: 2) and
1QHO for a maltogenic alpha-amylase (Novamyl, SEQ ID NO: 1), wherein the substrates are
indicated as GTE, GLC, CYL and GLD for 1A47 and as ABD for 1QHO. The two structures were
superimposed by minimizing the sum of squares for deviations at the three C-alpha atoms at the
catalytic triad: D230, E258 and D329 for 1A47, and D228, E256 and D329 for Novamyl. The
superimposed structures were analyzed, and the result is shown in Fig. 1 with the Novamyl
sequence at the top and the CGTase sequence below.

The following CGTase residues were found to have a C-alpha atom < 10 Å from an
atom of either substrate: 19, 21, 24, 46-47, 75, 77-78, 82-83, 85-103, 106, 136-145, 152-153,
182-187, 190-191, 193-200, 228-235, 257-267, 270, 282-289, 291-292, 296, 298, 324, 327-331,
359, 369-375. They are indicated by the first underlining in Fig. 1.

Two stretches ("loops") of consecutive residues were identified where some residues
have the C-alpha atom < 10 Å from an atom of either substrate and > 0.8 Å from the C-alpha
atom of any Novamyl residue. Including prefix and postfix, the two stretches are at residues
85-96 and 193-200 of the CGTase.

The following CGTase residues were found to be included in either of the above
subsets (< 10 Å from a substrate or in a loop) and to have a C-alpha atom > 0.8 Å from the C-alpha
atom of any Novamyl residue: 75, 77, 78, 85-94, 140, 144-145, 152, 182-187, 193-197, 235,
262-266, 286-289, 292, 296, 298, 369-370. They are indicated by the second underlining in
Fig. 1.
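The residue lists above can be checked for internal consistency by expanding the ranges into sets. The Python sketch below uses the lists as given in this example; `parse_ranges` is a hypothetical helper, and the check only confirms that the second-underlined residues fall within the first subset or the loops.

```python
def parse_ranges(text: str) -> set[int]:
    """Expand a list such as '19, 21, 46-47' into a set of residue numbers."""
    residues: set[int] = set()
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            first, last = (int(part) for part in item.split("-"))
            residues.update(range(first, last + 1))
        else:
            residues.add(int(item))
    return residues


near_substrate = parse_ranges(
    "19, 21, 24, 46-47, 75, 77-78, 82-83, 85-103, 106, 136-145, 152-153, "
    "182-187, 190-191, 193-200, 228-235, 257-267, 270, 282-289, 291-292, "
    "296, 298, 324, 327-331, 359, 369-375"
)
loops = parse_ranges("85-96, 193-200")
unmatched = parse_ranges(
    "75, 77, 78, 85-94, 140, 144-145, 152, 182-187, 193-197, 235, "
    "262-266, 286-289, 292, 296, 298, 369-370"
)

# Every residue with the second underlining should belong to one of the two subsets.
is_consistent = unmatched <= (near_substrate | loops)
```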

Variants were constructed by selecting residues in the CGTase of SEQ ID NO: 2 from
residues with the second underlining in Fig. 1 and identifying the corresponding residues in the
CGTase of SEQ ID NO: 3 from an alignment of the two CGTase sequences. As a result of the
high degree of identity, the residues have the same numbers in the two sequences. The
selected residues in SEQ ID NO: 3 were substituted as indicated below.

Variant 1 was created from SEQ ID NO: 3 as follows: CGTase residues A85-S95 were
substituted with Novamyl residues T80-N86. CGTase residues N194-L198 were substituted with
Novamyl residues N187-L196. Further, substitution Y152F was made to the CGTase sequence.
Variant 2 was created as Variant 1 with the following additional substitutions in SEQ ID
NO: 3: CGTase residues Y260-P268 were substituted with Novamyl residues Y258-L268.
Further substitutions F184W, R285D, Q288T and D314E were made to the CGTase sequence.

Variant 3 was created from SEQ ID NO: 3 as follows: CGTase residues A85-S95 were
substituted with Novamyl residues T80-N86. CGTase residues L261-P268 were substituted with
Novamyl residues D261-L268. The following further substitutions were made in the CGTase
sequence: Y152F, G257D, R285D, Q288T and D314E.

CLAIMS

1. A method of producing a variant polypeptide, which method comprises:

a) providing an amino acid sequence and a three-dimensional model for a cyclodextrin
glucanotransferase (CGTase) and for a maltogenic alpha-amylase, wherein one or both
models includes a substrate,

b) superimposing the two three-dimensional models,

c) and modifying the amino acid sequence of the CGTase wherein the modification
comprises:

i) deleting an amino acid residue in the CGTase sequence which has a
C-alpha atom < 10 Å from an atom of a substrate and > 0.8 Å from the
C-alpha atom of any amino acid residue in the maltogenic alpha-amylase
sequence,

ii) substituting an amino acid residue in the CGTase sequence which has
a C-alpha atom < 10 Å from an atom of a substrate and > 0.8 Å from the
C-alpha atom of any amino acid residue in the maltogenic alpha-amylase
sequence with a different amino acid residue, or

iii) deleting or substituting a stretch of consecutive CGTase residues
wherein each residue is > 0.8 Å from any residue in the maltogenic
alpha-amylase sequence and comprising at least one CGTase residue < 10 Å from a
substrate,

iv) inserting an amino acid residue at a position in the CGTase sequence
corresponding to a maltogenic alpha-amylase sequence which has a C-alpha
atom < 10 Å from an atom of a substrate and > 0.8 Å from the C-alpha atom of
any CGTase residue, and

d) producing the polypeptide having the resulting amino acid sequence.

2. The method of claim 1 wherein the substitution is made with an amino acid residue of the
same type as an unmatched amino acid residue at a corresponding position in the maltogenic
alpha-amylase sequence, wherein the type is positively charged, negatively charged,
hydrophilic or hydrophobic.

3. The method of claim 1 wherein the insertion is made with an amino acid residue of the same
type as an unmatched amino acid residue at a corresponding position in the maltogenic
alpha-amylase sequence, wherein the type is positively charged, negatively charged, hydrophilic or
hydrophobic.

4. The method of any preceding claim wherein the modification of the amino acid sequence
further comprises substitution of a matched amino acid residue in the CGTase sequence which
has a C-alpha atom located less than 10 Å from an atom of a substrate with a different amino
acid residue.

5. The method of the preceding claim wherein the substitution of the matched amino acid
residue is made with an amino acid residue of the same type as the matching amino acid residue of
the maltogenic alpha-amylase sequence, wherein the type is positively charged, negatively
charged, hydrophilic or hydrophobic.

6. The method of any preceding claim which further comprises preparing the variant
polypeptide, letting it act on starch, and selecting a variant polypeptide having the ability to form linear
oligosaccharide as an initial product.

7. A polypeptide which:

a) has an amino acid sequence having at least 70% identity to a parent cyclodextrin
glucanotransferase (CGTase);

b) comprises insertion of an amino acid compared to the parent CGTase in a region
corresponding to amino acids 194-198 of SEQ ID NO: 3,

c) comprises an amino acid modification compared to the parent CGTase which is
substitution, insertion or deletion of an amino acid at a position corresponding to amino
acid 85-95, 152, 184, 260-269, 285, 288 or 314 of the amino acid sequence shown in
SEQ ID NO: 3, and

d) has the ability to form linear oligosaccharides as an initial product when acting on
starch.

8. The polypeptide of claim 7 comprising insertion of 1-7 amino acids, particularly 5 amino
acids, more particularly insertion of DPAGF, most particularly between amino acids corresponding
to 196 and 197 of SEQ ID NO: 3.

9. The polypeptide of claim 7 or 8, further comprising substitution of an amino acid
corresponding to any of amino acids 194-198 of SEQ ID NO: 3, particularly a substitution corresponding to
L195F, F196T or D197S in SEQ ID NO: 3.

10. The polypeptide of any preceding claim, comprising a substitution or insertion of an amino
acid residue with an amino acid residue of a corresponding position in the amino acid sequence
shown in SEQ ID NO: 1 or a deletion of an amino acid residue in the region which is not present
at the corresponding position in the amino acid sequence shown in SEQ ID NO: 1.

11. The polypeptide of any preceding claim, comprising substitution of amino acids
corresponding to amino acids 85-95, 260-268 or 260-269 of SEQ ID NO: 3 with TLAGTDN,
YGDDPGTANHL or YGDDPGTANHLE, respectively.

12. The polypeptide of any preceding claim comprising a substitution corresponding to Y152F,
F184W, R285W, Q288T, D314E in SEQ ID NO: 3.

13. A process for preparing a baked product which comprises adding the polypeptide of any
preceding claim, or a polypeptide produced by the method of any of claims 1-6 to a dough and
baking the dough to prepare the baked product, wherein the polypeptide is added in an amount
which is effective to retard the staling of the baked product.
1 10 20 30 40 50 60 70 

SSSASVKGDVIYQIIIDRFYDGDTTNNNPAKSYGLYDPTKSKWKMYWGGDLEGVRQKL- - PYLK 

ASDTAVSN WNYSTOVI YQIVTDR FVDGNTSNNPT- - -GDLYDPTHTSLKKYFGGDWQGIINKINDGYLT 67 

************ 

PVLDNLDTLAGT DhTTGYHGYWTRDFKQIEEHFGfftftnTFDTLVNDAHQNGIKVIVDFVPNHSTPFKA 

QPVENIYAVLPDSTFGGSTSYHGYWARDFKRTNPYFGSFTDFQNLINTAHAHNIKVIIDFAPNHTSPASE 137 



************* 

NDSTFAE6GAQLG\nTIWLSLYNNGTYMGNYFDDATKGYFHHNGDISNWDDRYEAQWKNFTDPAGFSLAD 
TDPTYA ENG RGMGVTAIWISL YONGTLLGGYTNDT- NGYFHHYGGTD - FS S YEOGI YRN L F DLAD 200 



LSQENGTX AQYLTDAAVQLVAHGADGLRXDAVKH FNSGFSKSLADKLYQKKDI FLVGEWYGDD- PGTANH 
LNQQNSTIDSYLK5AIKVWLDMGIDGIRLDAVKHMPFGWQKNFMDSILSYRPVFTFGEWFLG-TNEI — D 267 



LEKVRYANNSGVNVLDFDL^IRNVFOTFTQ™^^ 

VNhrnTFANESGMSLLDFRFSQKVRQVFRDNTDTWYGLDSMIQSTASDYNFINDMVTFIDNHDMDRFYN-G 336 



SNKANLH(^LAFILTSRGTPSIYYGTEQYMAGGNDPYNRGMMPAFDTTTTAFKEVSTLAGLRRNNAAIQY 
GSTRPVEQALAFTLTSRGVPAIYYGTEQYMTGNGDPYNRAMMTSFNTSTTAYNVIKKLAPLRKSNPAIAY 406 



GTTTQRWINNDVYI YERKFFNDWLVAINRNTQSSYSISGLQTAt PNGSYADYL SGL LGGNG I SVS -NGS 
GTTQQRWINNDVYIYERKFGNN VALVAI N RN LSTSYN ITGLYTAL PAGTYTDVLGGL LNGN SI SVASDGS 476 

VAS FTLAPGAVSVWQYST- S ASA PQIGS VA PNMGIPGN WTI DGKG FGTTQGTVTFGGVTATVKSItfTSN R 
VTPFTLSAGEVAVWQYVSSSN - S PL IGH VGPTMTKAGQTITIDGRG FGTTSGQVLFGSTAGTIVS WDDTE 545 

IEVWPNMAAGLTOVKVTA-GGVSSNLYS-YNILSGTQTSWFTVKSAPPTNLGDKIYLTGNIPELGNWS 
WVKVPSVTPGKYNISLKTSSGATSNTYNNIN^ 614 

TDTSGAVNNAQGPL LAP- - -NYPDWFYVFSVPAGKTIQFKFFIKRADGT-IQWENGSNHVATTPTGATGN 
IS- KAIGPMFNQWYQYPTWYYDVSVPAGTTIQFKFIKKN — GNTITWEGGSNHTYTVPSSSTGT 676 

rrvrwQN 

VIVNWQQ 683 



Fig. 1
SEQUENCE LISTING 

<110> Novozymes A/S

<120> CGTASE VARIANTS 

<130> 10340-000 

<160> 3 

<170> PatentIn version 3.2

<210> 1 

<211> 686 

<212> PRT 

<213> Bacillus stearothermophilus 

<400> 1 

Ser Ser Ser Ala Ser Val Lys Gly Asp Val Ile Tyr Gln Ile Ile Ile
1 5 10 15

Asp Arg Phe Tyr Asp Gly Asp Thr Thr Asn Asn Asn Pro Ala Lys Ser
20 25 30

Tyr Gly Leu Tyr Asp Pro Thr Lys Ser Lys Trp Lys Met Tyr Trp Gly
35 40 45

Gly Asp Leu Glu Gly Val Arg Gln Lys Leu Pro Tyr Leu Lys Gln Leu
50 55 60

Gly Val Thr Thr Ile Trp Leu Ser Pro Val Leu Asp Asn Leu Asp Thr
65 70 75 80

Leu Ala Gly Thr Asp Asn Thr Gly Tyr His Gly Tyr Trp Thr Arg Asp
85 90 95

Phe Lys Gln Ile Glu Glu His Phe Gly Asn Trp Thr Thr Phe Asp Thr
100 105 110

Leu Val Asn Asp Ala His Gln Asn Gly Ile Lys Val Ile Val Asp Phe
115 120 125

Val Pro Asn His Ser Thr Pro Phe Lys Ala Asn Asp Ser Thr Phe Ala
130 135 140

Glu Gly Gly Ala Leu Tyr Asn Asn Gly Thr Tyr Met Gly Asn Tyr Phe
145 150 155 160

Asp Asp Ala Thr Lys Gly Tyr Phe His His Asn Gly Asp Ile Ser Asn
165 170 175

Trp Asp Asp Arg Tyr Glu Ala Gln Trp Lys Asn Phe Thr Asp Pro Ala
180 185 190

Gly Phe Ser Leu Ala Asp Leu Ser Gln Glu Asn Gly Thr Ile Ala Gln
195 200 205

Tyr Leu Thr Asp Ala Ala Val Gln Leu Val Ala His Gly Ala Asp Gly
210 215 220

Leu Arg Ile Asp Ala Val Lys His Phe Asn Ser Gly Phe Ser Lys Ser
225 230 235 240

Leu Ala Asp Lys Leu Tyr Gln Lys Lys Asp Ile Phe Leu Val Gly Glu
245 250 255

Trp Tyr Gly Asp Asp Pro Gly Thr Ala Asn His Leu Glu Lys Val Arg
260 265 270

Tyr Ala Asn Asn Ser Gly Val Asn Val Leu Asp Phe Asp Leu Asn Thr
275 280 285

Val Ile Arg Asn Val Phe Gly Thr Phe Thr Gln Thr Met Tyr Asp Leu
290 295 300

Asn Asn Met Val Asn Gln Thr Gly Asn Glu Tyr Lys Tyr Lys Glu Asn
305 310 315 320

Leu Ile Thr Phe Ile Asp Asn His Asp Met Ser Arg Phe Leu Ser Val
325 330 335

Asn Ser Asn Lys Ala Asn Leu His Gln Ala Leu Ala Phe Ile Leu Thr
340 345 350

Ser Arg Gly Thr Pro Ser Ile Tyr Tyr Gly Thr Glu Gln Tyr Met Ala
355 360 365

Gly Gly Asn Asp Pro Tyr Asn Arg Gly Met Met Pro Ala Phe Asp Thr
370 375 380

Thr Thr Thr Ala Phe Lys Glu Val Ser Thr Leu Ala Gly Leu Arg Arg
385 390 395 400

Asn Asn Ala Ala Ile Gln Tyr Gly Thr Thr Thr Gln Arg Trp Ile Asn
405 410 415

Asn Asp Val Tyr Ile Tyr Glu Arg Lys Phe Phe Asn Asp Val Val Leu
420 425 430

Val Ala Ile Asn Arg Asn Thr Gln Ser Ser Tyr Ser Ile Ser Gly Leu
435 440 445

Gln Thr Ala Leu Pro Asn Gly Ser Tyr Ala Asp Tyr Leu Ser Gly Leu
450 455 460

Leu Gly Gly Asn Gly Ile Ser Val Ser Asn Gly Ser Val Ala Ser Phe
465 470 475 480

Thr Leu Ala Pro Gly Ala Val Ser Val Trp Gln Tyr Ser Thr Ser Ala
485 490 495

Ser Ala Pro Gln Ile Gly Ser Val Ala Pro Asn Met Gly Ile Pro Gly
500 505 510

Asn Val Val Thr Ile Asp Gly Lys Gly Phe Gly Thr Thr Gln Gly Thr
515 520 525

Val Thr Phe Gly Gly Val Thr Ala Thr Val Lys Ser Trp Thr Ser Asn
530 535 540

Arg Ile Glu Val Tyr Val Pro Asn Met Ala Ala Gly Leu Thr Asp Val
545 550 555 560

Lys Val Thr Ala Gly Gly Val Ser Ser Asn Leu Tyr Ser Tyr Asn Ile
565 570 575

Leu Ser Gly Thr Gln Thr Ser Val Val Phe Thr Val Lys Ser Ala Pro
580 585 590

Pro Thr Asn Leu Gly Asp Lys Ile Tyr Leu Thr Gly Asn Ile Pro Glu
595 600 605

Leu Gly Asn Trp Ser Thr Asp Thr Ser Gly Ala Val Asn Asn Ala Gln
610 615 620

Gly Pro Leu Leu Ala Pro Asn Tyr Pro Asp Trp Phe Tyr Val Phe Ser
625 630 635 640

Val Pro Ala Gly Lys Thr Ile Gln Phe Lys Phe Phe Ile Lys Arg Ala
645 650 655

Asp Gly Thr Ile Gln Trp Glu Asn Gly Ser Asn His Val Ala Thr Thr
660 665 670

Pro Thr Gly Ala Thr Gly Asn Ile Thr Val Thr Trp Gln Asn
675 680 685

<210> 2 
<211> 683 
<212> PRT 

<213> Thermoanaerobacterium thermosulfurigenes
<400> 2 

Ala Ser Asp Thr Ala Val Ser Asn Val Val Asn Tyr Ser Thr Asp Val
1 5 10 15

Ile Tyr Gln Ile Val Thr Asp Arg Phe Val Asp Gly Asn Thr Ser Asn
20 25 30

Asn Pro Thr Gly Asp Leu Tyr Asp Pro Thr His Thr Ser Leu Lys Lys
35 40 45

Tyr Phe Gly Gly Asp Trp Gln Gly Ile Ile Asn Lys Ile Asn Asp Gly
50 55 60

Tyr Leu Thr Gln Pro Val Glu Asn Ile Tyr Ala Val Leu Pro Asp Ser
65 70 75 80

Thr Phe Gly Gly Ser Thr Ser Tyr His Gly Tyr Trp Ala Arg Asp Phe
85 90 95

Lys Arg Thr Asn Pro Tyr Phe Gly Ser Phe Thr Asp Phe Gln Asn Leu
100 105 110

Ile Asn Thr Ala His Ala His Asn Ile Lys Val Ile Ile Asp Phe Ala
115 120 125

Pro Asn His Thr Ser Pro Ala Ser Glu Thr Asp Pro Thr Tyr Ala Glu
130 135 140

Asn Gly Arg Gly Met Gly Val Thr Ala Ile Trp Ile Ser Leu Tyr Asp
145 150 155 160

Asn Gly Thr Leu Leu Gly Gly Tyr Thr Asn Asp Thr Asn Gly Tyr Phe
165 170 175

His His Tyr Gly Gly Thr Asp Phe Ser Ser Tyr Glu Asp Gly Ile Tyr
180 185 190

Arg Asn Leu Phe Asp Leu Ala Asp Leu Asn Gln Gln Asn Ser Thr Ile
195 200 205

Asp Ser Tyr Leu Lys Ser Ala Ile Lys Val Trp Leu Asp Met Gly Ile
210 215 220

Asp Gly Ile Arg Leu Asp Ala Val Lys His Met Pro Phe Gly Trp Gln
225 230 235 240

Lys Asn Phe Met Asp Ser Ile Leu Ser Tyr Arg Pro Val Phe Thr Phe
245 250 255

Gly Glu Trp Phe Leu Gly Thr Asn Glu Ile Asp Val Asn Asn Thr Tyr
260 265 270

Phe Ala Asn Glu Ser Gly Met Ser Leu Leu Asp Phe Arg Phe Ser Gln
275 280 285

Lys Val Arg Gln Val Phe Arg Asp Asn Thr Asp Thr Met Tyr Gly Leu
290 295 300

Asp Ser Met Ile Gln Ser Thr Ala Ser Asp Tyr Asn Phe Ile Asn Asp
305 310 315 320

Met Val Thr Phe Ile Asp Asn His Asp Met Asp Arg Phe Tyr Asn Gly
325 330 335

Gly Ser Thr Arg Pro Val Glu Gln Ala Leu Ala Phe Thr Leu Thr Ser
340 345 350

Arg Gly Val Pro Ala Ile Tyr Tyr Gly Thr Glu Gln Tyr Met Thr Gly
355 360 365

Asn Gly Asp Pro Tyr Asn Arg Ala Met Met Thr Ser Phe Asn Thr Ser
370 375 380

Thr Thr Ala Tyr Asn Val Ile Lys Lys Leu Ala Pro Leu Arg Lys Ser
385 390 395 400

Asn Pro Ala Ile Ala Tyr Gly Thr Thr Gln Gln Arg Trp Ile Asn Asn
405 410 415

Asp Val Tyr Ile Tyr Glu Arg Lys Phe Gly Asn Asn Val Ala Leu Val
420 425 430

Ala Ile Asn Arg Asn Leu Ser Thr Ser Tyr Asn Ile Thr Gly Leu Tyr
435 440 445

Thr Ala Leu Pro Ala Gly Thr Tyr Thr Asp Val Leu Gly Gly Leu Leu
450 455 460

Asn Gly Asn Ser Ile Ser Val Ala Ser Asp Gly Ser Val Thr Pro Phe
465 470 475 480

Thr Leu Ser Ala Gly Glu Val Ala Val Trp Gln Tyr Val Ser Ser Ser
485 490 495

Asn Ser Pro Leu Ile Gly His Val Gly Pro Thr Met Thr Lys Ala Gly
500 505 510

Gln Thr Ile Thr Ile Asp Gly Arg Gly Phe Gly Thr Thr Ser Gly Gln
515 520 525

Val Leu Phe Gly Ser Thr Ala Gly Thr Ile Val Ser Trp Asp Asp Thr
530 535 540

Glu Val Lys Val Lys Val Pro Ser Val Thr Pro Gly Lys Tyr Asn Ile
545 550 555 560

Ser Leu Lys Thr Ser Ser Gly Ala Thr Ser Asn Thr Tyr Asn Asn Ile
565 570 575

Asn Ile Leu Thr Gly Asn Gln Ile Cys Val Arg Phe Val Val Asn Asn
580 585 590

Ala Ser Thr Val Tyr Gly Glu Asn Val Tyr Leu Thr Gly Asn Val Ala
595 600 605

Glu Leu Gly Asn Trp Asp Thr Ser Lys Ala Ile Gly Pro Met Phe Asn
610 615 620

Gln Val Val Tyr Gln Tyr Pro Thr Trp Tyr Tyr Asp Val Ser Val Pro
625 630 635 640

Ala Gly Thr Thr Ile Gln Phe Lys Phe Ile Lys Lys Asn Gly Asn Thr
645 650 655

Ile Thr Trp Glu Gly Gly Ser Asn His Thr Tyr Thr Val Pro Ser Ser
660 665 670

Ser Thr Gly Thr Val Ile Val Asn Trp Gln Gln
675 680

<210> 3 
<211> 683 
<212> PRT 

<213> Thermoanaerobacter sp. 
<400> 3 

Ala Pro Asp Thr Ser Val Ser Asn Val Val Asn Tyr Ser Thr Asp Val
1 5 10 15

Ile Tyr Gln Ile Val Thr Asp Arg Phe Leu Asp Gly Asn Pro Ser Asn
20 25 30

Asn Pro Thr Gly Asp Leu Tyr Asp Pro Thr His Thr Ser Leu Lys Lys
35 40 45

Tyr Phe Gly Gly Asp Trp Gln Gly Ile Ile Asn Lys Ile Asn Asp Gly
50 55 60

Tyr Leu Thr Gly Met Gly Ile Thr Ala Ile Trp Ile Ser Gln Pro Val
65 70 75 80

Glu Asn Ile Tyr Ala Val Leu Pro Asp Ser Thr Phe Gly Gly Ser Thr
85 90 95

Ser Tyr His Gly Tyr Trp Ala Arg Asp Phe Lys Lys Thr Asn Pro Phe
100 105 110

Phe Gly Ser Phe Thr Asp Phe Gln Asn Leu Ile Ala Thr Ala His Ala
115 120 125

His Asn Ile Lys Val Ile Ile Asp Phe Ala Pro Asn His Thr Ser Pro
130 135 140

Ala Ser Glu Thr Asp Pro Thr Tyr Gly Glu Asn Gly Arg Leu Tyr Asp
145 150 155 160

Asn Gly Val Leu Leu Gly Gly Tyr Thr Asn Asp Thr Asn Gly Tyr Phe
165 170 175

His His Tyr Gly Gly Thr Asn Phe Ser Ser Tyr Glu Asp Gly Ile Tyr
180 185 190

Arg Asn Leu Phe Asp Leu Ala Asp Leu Asp Gln Gln Asn Ser Thr Ile
195 200 205

Asp Ser Tyr Leu Lys Ala Ala Ile Lys Leu Trp Leu Asp Met Gly Ile
210 215 220

Asp Gly Ile Arg Met Asp Ala Val Lys His Met Ala Phe Gly Trp Gln
225 230 235 240

Lys Asn Phe Met Asp Ser Ile Leu Ser Tyr Arg Pro Val Phe Thr Phe
245 250 255

Gly Glu Trp Tyr Leu Gly Thr Asn Glu Val Asp Pro Asn Asn Thr Tyr
260 265 270

Phe Ala Asn Glu Ser Gly Met Ser Leu Leu Asp Phe Arg Phe Ala Gln
275 280 285

Lys Val Arg Gln Val Phe Arg Asp Asn Thr Asp Thr Met Tyr Gly Leu
290 295 300

Asp Ser Met Ile Gln Ser Thr Ala Ala Asp Tyr Asn Phe Ile Asn Asp
305 310 315 320

Met Val Thr Phe Ile Asp Asn His Asp Met Asp Arg Phe Tyr Thr Gly
325 330 335

Gly Ser Thr Arg Pro Val Glu Gln Ala Leu Ala Phe Thr Leu Thr Ser
340 345 350

Arg Gly Val Pro Ala Ile Tyr Tyr Gly Thr Glu Gln Tyr Met Thr Gly
355 360 365

Asn Gly Asp Pro Tyr Asn Arg Ala Met Met Thr Ser Phe Asp Thr Thr
370 375 380

Thr Thr Ala Tyr Asn Val Ile Lys Lys Leu Ala Pro Leu Arg Lys Ser
385 390 395 400

Asn Pro Ala Ile Ala Tyr Gly Thr Gln Lys Gln Arg Trp Ile Asn Asn
405 410 415

Asp Val Tyr Ile Tyr Glu Arg Gln Phe Gly Asn Asn Val Ala Leu Val
420 425 430

Ala Ile Asn Arg Asn Leu Ser Thr Ser Tyr Tyr Ile Thr Gly Leu Tyr
435 440 445

Thr Ala Leu Pro Ala Gly Thr Tyr Ser Asp Met Leu Gly Gly Leu Leu
450 455 460

Asn Gly Ser Ser Ile Thr Val Ser Ser Asn Gly Ser Val Thr Pro Phe
465 470 475 480

Thr Leu Ala Pro Gly Glu Val Ala Val Trp Gln Tyr Val Ser Thr Thr
485 490 495

Asn Pro Pro Leu Ile Gly His Val Gly Pro Thr Met Thr Lys Ala Gly
500 505 510

Gln Thr Ile Thr Ile Asp Gly Arg Gly Phe Gly Thr Thr Ala Gly Gln
515 520 525

Val Leu Phe Gly Thr Thr Pro Ala Thr Ile Val Ser Trp Glu Asp Thr
530 535 540

Glu Val Lys Val Lys Val Pro Ala Leu Thr Pro Gly Lys Tyr Asn Ile
545 550 555 560

Thr Leu Lys Thr Ala Ser Gly Val Thr Ser Asn Ser Tyr Asn Asn Ile
565 570 575

Asn Val Leu Thr Gly Asn Gln Val Cys Val Arg Phe Val Val Asn Asn
580 585 590

Ala Thr Thr Val Trp Gly Glu Asn Val Tyr Leu Thr Gly Asn Val Ala
595 600 605

Glu Leu Gly Asn Trp Asp Thr Ser Lys Ala Ile Gly Pro Met Phe Asn
610 615 620

Gln Val Val Tyr Gln Tyr Pro Thr Trp Tyr Tyr Asp Val Ser Val Pro
625 630 635 640

Ala Gly Thr Thr Ile Glu Phe Lys Phe Ile Lys Lys Asn Gly Ser Thr
645 650 655

Val Thr Trp Glu Gly Gly Tyr Asn His Val Tyr Thr Thr Pro Thr Ser
660 665 670

Gly Thr Ala Thr Val Ile Val Asp Trp Gln Pro
675 680